(54) Abstract Title

Method and apparatus for watermarking digital data 

(57) A watermark in the form of an added message is attached to a digital recording so that a significant
content of the recording is completely unchanged by the process, in the sense that any reader commonly used
for such recordings will extract from the recording exactly what would have been extracted had the
added message not been attached. This is done by hiding the added message in the error correcting code
(ECC) for the significant content of the recording.

METHOD AND APPARATUS FOR WATERMARKING DIGITAL DATA

The present invention generally relates to a method and
apparatus for preventing counterfeiting of digitally encoded media
such as audio/visual and computer software compact disks (CDs) and,
more particularly, to a watermarking technique for digital data.

Counterfeiting costs compact disk companies, software companies
and other industries around the world billions of dollars yearly.
Several methods have been proposed to fight against counterfeiting. In
US patent application Serial No. 09/060,026, a coded message is
associated with the combination of the significant content of the disk
and a serial number on the disk. This coded message is hidden using
some least significant bits of the recording. However, musicians
usually consider the standard 16-bit technology used to digitize
musical signals for compact disk recording insufficient to fully
render the analog music quality. As a consequence, sacrificing a few
bits, even some of the least significant bits, is considered
unacceptable by music producers. It is possible to intertwine the
musical signal with a coded signal not made audible by the compact
disk player but, in most obvious implementations at least, this would
require the use of special disk readers, a solution which is clearly
unappealing from a commercial point of view. It is also possible to
choose the bits carrying the authentication code according to some
model for musical perception in order to minimize the audible effects
of changing the audio data, but this cannot be expected to be as good
as keeping the full sixteen bits. It will be appreciated that, in
addition to music data files, other types of data files such as
computer program code would better be recorded without any change of
the significant content.

The main problem solved by the present invention stems from the 
fact that prior methods of digital watermarking cannot be used for 
several types of applications, since they modify the original data 
set. 

According to one aspect of the present invention there is 
provided a method for attaching an added message to a digital message 
so that the significant content of the digital message is completely 
unchanged comprising the step of hiding the added message in an error 
correcting code for the significant content of the digital message. 



Other aspects are defined in the appended claims.



According to the invention, the basic principle is to hide all
of the authentication data in the error correcting code (ECC) of the
digital recording. The method of the invention can be used both to
guarantee originality and to recognize counterfeiting. In the latter
case, a serial number may be attached to any recording and serves,
together with the significant content of the recording, to create the
protecting code. A counterfeiter can only produce legitimate pairings
between the serial number and the encoding by copying originals and
can only duplicate as many unique, verifiable such pairs as he has
access to. Depending on the size of the watermark, the probability of
error in the recovered watermark due to read/write errors can be
reduced by means of a second level of error correction coding.

The present invention has a significant advantage that it is not 
obvious that any encoding has been used (which is possibly desirable 
in some contexts) and that neither a special reader nor special 
software is necessary at the reading end, except to extract the 
watermark. Thus, besides music, another important application of the 
present invention is provided by data such as computer program code 
where the data is often needed with full precision and, if the format
is fixed, there is no obvious space usable to embed a protecting code 
to guarantee that the data have not been modified. 

Embodiments of the invention will now be described, by way of 
example only, with reference to the accompanying drawings, in which: 

Figure 1 is a flow diagram describing the general principle of
error correcting codes (ECC) utilization in recording;

Figure 2 is a flow diagram representing the general mechanism of
the present invention;

Figure 3 is a flow diagram representing a first preferred 
embodiment of the present invention; 

Figure 4 is a flow diagram representing a second preferred 
embodiment of the present invention; and 

Figure 5 is a flow diagram representing a third preferred 
embodiment of the present invention. 

By way of background on error correcting codes, consider some
significant content such as a musical recording, a musical score, a
set of data, a software code, etc. Such significant content can be
recorded on optical, magnetic or other suitable media after being
digitized. After the digitization the significant content takes the
form of a binary word (in short, word) W = b_1 b_2 ... b_N, where b_i,
i = 1, 2, ..., N, is either 0 or 1. This word will still be considered as
significant content and, in fact, when we refer to the "significant
content" in the sequel, we usually mean a word such as W rather than
the original image, music, data set, etc. it represents. In order to
record the significant content on some chosen medium, one will usually
not simply record the string of b_i's as they appear in W. This is
because the recording, the manipulation of the medium, the reading
from the medium, and possibly some other factors, all can introduce
errors which would, sometimes severely, alter the integrity of the
significant content: for instance the music one would play according
to the erroneously read W would be different, often quite different,
from the music represented by W. Instead of recording the word W, one
uses what are called error correcting codes (ECC), which have
precisely the virtue of allowing some errors in strings of symbols to
occur while still permitting the recovery of the word W, and thereby
retrieving the significant content.






Abstractly, an ECC can be thought of as given by a many-to-one
map F on the set of finite words. The inverse set F^{-1}(W) of any word W
to be coded contains a special subset P(F^{-1}(W)) which has the property
that a small enough number of errors (such as bit changes and/or bit
omissions and/or spurious extra bits) transforms any word in P(F^{-1}(W))
into a word of F^{-1}(W). Therefore, if one records a member of the set
P(F^{-1}(W)) one can still recover the word W, even in the face of these
sorts of errors. "Error Control Coding: Fundamentals and
Applications", S. Lin and D. J. Costello, Prentice-Hall, 1983, is a
general reference for the subject of error correcting codes. In
practice, the ECC can be defined at the level of subwords of limited
length, and for definiteness, we will limit ourselves to such ECCs,
although the invention could as well be used for the more general
case.
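
By way of illustration only, the abstract picture above can be
sketched with a toy five-fold repetition code. This is an assumption
made for clarity and is not the code used on compact disks. The names
encode, decode and R are hypothetical: decode plays the role of the
many-to-one map F, and encode(W) is one member of the special subset
P(F^{-1}(W)).

```python
R = 5  # repetition factor (assumed); majority voting corrects up to 2 flips per bit


def encode(word):
    """Return a word that decodes to `word`: each bit is repeated R times."""
    return [b for b in word for _ in range(R)]


def decode(stored):
    """Many-to-one map F: majority vote over each group of R stored bits."""
    return [int(sum(stored[i:i + R]) > R // 2) for i in range(0, len(stored), R)]


W = [1, 0, 1, 1]
stored = encode(W)
# Two bit changes inside the first group: still few enough to be corrected.
stored[0] ^= 1
stored[3] ^= 1
recovered = decode(stored)
```

Many different stored words decode to the same W, which is what leaves
room for carrying an added message without changing what is read.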






With reference to Figure 1, there is shown a flow chart of ECC
with the amount of generality needed in the rest of this description.
In function block 101, the word W is cut into blocks w_1, w_2, ..., w_{N'},
where each block is made of some (usually fixed) number of successive
b_i's (i.e., in general, N' is smaller than N). The ECC encoder 102
associates to each w_i some other string of 0s and 1s, denoted by w'_i.
Each w'_i is called a codeword. The ordered concatenation
w'_1 w'_2 ... w'_{N'} of the w'_i's forms the ECC data W'. During
transmission or storage, some errors might be introduced into W', so
that, instead of w'_i, a corrupted codeword (block 103) is recorded.
While the primary application of the invention is to get data onto a
disk, the data can also be transmitted, and the same techniques can be
applied to digitally encoding data on a transmission medium. That is
to say that such a transmission medium typically has some significant
content W, which is encoded in a particular format through the use of
specific error correcting codes (i.e., these are dictated by
communications protocols), yielding W', which is then encoded in a
specific optical, acoustic, or electrical manner for transmission on a
specific medium (such as a computer local area network such as
Ethernet or Token Ring, or through a modem for transmission on a phone
line, or through a digital telecommunications medium such as ISDN, or
via any wireless transmission). The present invention can be used to
transmit W" instead of W'. W" again has the property that it can be
read back through an ECC decoder to yield the original word W, but can
also be verified for authenticity through the use of a special reader.
If the number of errors is less than a specific number (which depends
on the type of ECC used), then the ECC error correction unit corrects
the errors and returns the uncorrupted codewords w'_i. When w'_i is
decoded in the ECC decoder 104, the word w_i (output 105) is given as
the output of decoder 104.
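
The flow of Figure 1 can be sketched as follows, purely as an
illustration. The sketch assumes a Hamming(7,4) code with blocks of
four bits, which corrects only one error per codeword and is therefore
weaker than the codes assumed in this description; the function names
and the sample word are hypothetical.

```python
BLOCK = 4  # bits per block w_i (assumed for illustration)


def split_blocks(word):
    """Function block 101: cut W into consecutive blocks of BLOCK bits."""
    return [word[i:i + BLOCK] for i in range(0, len(word), BLOCK)]


def ecc_encode(block):
    """ECC encoder 102: Hamming(7,4) codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = block
    return [d1 ^ d2 ^ d4, d1 ^ d3 ^ d4, d1, d2 ^ d3 ^ d4, d2, d3, d4]


def ecc_decode(codeword):
    """ECC decoder 104: correct at most one flipped bit, return the data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s4  # 1-based position of the error, 0 if none
    if pos:
        c[pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


W = [1, 0, 1, 1, 0, 0, 1, 0]
codewords = [ecc_encode(w) for w in split_blocks(W)]  # the ECC data W'
codewords[0][5] ^= 1  # block 103: one error in the first codeword
codewords[1][0] ^= 1  # and one error in the second codeword
decoded = [bit for cw in codewords for bit in ecc_decode(cw)]
```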

In general, the average length of w'_i is bigger than that of w_i.
What really matters is that not only w'_i, but also any string of 0s and
1s obtained from w'_i by making at most K errors (where K and the type
of allowed errors depend on the chosen ECC (and possibly on w_i)) is
corrected and decoded by decoder 104 to generate w_i. In the
sequel, we will always assume that K>2, which can be easily achieved
by any of the currently used ECCs, such as the Reed-Solomon code.

With reference first to Figure 2, some authentication data A 201
is associated to any data set W 202 to be recorded. The basic
principle used is to hide all of the authentication data in the error
correcting code 210 used to perform the recording. Following the
teaching of US application Serial No. 09/060,026, the authentication
data A may be chosen to be such that the triple (A,W,D) at 205, formed
by the authentication data A, the data set W, and some other auxiliary
data set D 203 (associated to W or to the physical support of the
recording) cannot be generated by unauthorized parties. The code used
for the generation of such triples can be based on (secret or public)
encryption 200. In particular, instead of the general implicit
relation between A, W and D, the authentication message A may be a
coded message depending on the pair (W,D), in which case we will often
write C instead of the generic notation A. Depending on the precise
implementation, the present invention allows recognizing the
originality and/or legitimacy of recordings in such a way that the
meaning of what is recorded is completely unaffected by the
implementation of the invention and standard readers would neither
detect nor be affected by the implementation of the invention.

We note that using some of the error correcting codes to carry
authentication data may, in general, reduce the robustness of the
error correction scheme. That is, while the original system may have
been able to correct K errors, a system using our invention may now be
able to correct K'<K errors. This is not viewed as a significant
problem, as common delivery media such as the compact disk (CD) are
capable of correcting far larger numbers of errors than are generally
required (see, for example, the "Philips-Sony Red Book" or International
Electrotechnical Commission standard IEC 60908 (1987-09), for detailed
information on the Compact Disc standard). It will be understood by
those skilled in the art that the CD standard provides certain subcode
channels, i.e., channels which contain non-audio data. Any of these
might trivially be used to carry digital signatures or watermarks.
However, the invention still has utility for storage media other than
the standard CD, or for audio CDs in which the subcode channels are
otherwise in use, as well as for transmission media which have no
unused subcode channels. In fact, in any particular embodiment, it
should be possible to carefully distribute the authentication data
across the medium so as to distribute the reduction in robustness as
appropriate. For example, the robustness of the error correction
scheme can be compromised in less critical areas of the data set or
better-protected areas of the physical medium, or evenly distributed
across the data set to minimize the aggregate impact. Furthermore,
in certain media, again such as CDs, in addition to the use of error
correction codes, the data is interleaved on the physical medium in a
manner which makes the data recovery process exceptionally robust in
the face of specific types of errors, e.g., burst errors and errors
located in physically contiguous regions of the medium. Appropriate
distribution of the additional authentication data can help preserve
these sorts of robust behaviors.
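
The reduction from K to K' can be made concrete with a small worked
example. The sketch below is illustrative only and assumes a five-fold
repetition code in which one bit of a codeword is deliberately changed
to carry authentication data; the helper names are hypothetical.

```python
R = 5  # assumed repetition factor of the toy code


def decode(codeword):
    """Majority vote over the R bits of one codeword."""
    return int(sum(codeword) > R // 2)


def survives(codeword, true_bit, extra_errors):
    """Flip `extra_errors` further bits that are still correct, then decode."""
    damaged = list(codeword)
    flipped = 0
    for i, b in enumerate(damaged):
        if flipped == extra_errors:
            break
        if b == true_bit:
            damaged[i] ^= 1
            flipped += 1
    return decode(damaged) == true_bit


def capacity(codeword, true_bit):
    """Largest number of further errors that is still corrected."""
    return max(k for k in range(R) if survives(codeword, true_bit, k))


clean = [1] * R      # codeword for the bit 1
marked = list(clean)
marked[0] ^= 1       # one deliberate change carrying authentication data

K = capacity(clean, 1)         # errors tolerated before marking
K_prime = capacity(marked, 1)  # errors tolerated after marking
```

In this toy setting K is 2 and K' is 1: each deliberately changed bit
uses up one unit of the correction capability of its codeword.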






A description of the specific cryptographic techniques used
herein (secret key/public key (SK/PK) pairs and hash functions) can be
found in Handbook of Applied Cryptography by Alfred J. Menezes, Paul
C. van Oorschot and Scott A. Vanstone, CRC Press, 1997.

A first embodiment of the invention will now be described with
reference to Figure 3. The significant content W = w_1 w_2 ... w_N at input
block 300, possibly combined with supplementary data D at input block
301 (such as time, data and/or serial numbers attached physically to
the recording), is encrypted by a secret key S1 in the encryption unit
302 to generate a coded message C at 303. The coded message C can be
represented as some sequence s_1 s_2 ... s_c of 0s and 1s. For convenience,
we will assume that the length c of C is fixed once and for all, but
this is quite inessential to the invention and other conventions can
be taken as well. At the ECC encoder 320, the word W is transformed to
the primary error corrected word W' = w'_1 w'_2 ... w'_M of length M at 321 (in
general, M is greater than N). A defined algorithm A at 304 associates
to M a collection j_1 < j_2 < ... < j_c < M of addresses of coding blocks
w'_{j_1}, w'_{j_2}, ..., w'_{j_c}. For instance, one can take the w'_{j_i}'s, with i in
{1, 2, ..., c}, as evenly distributed along W'. The choice of
(j_1, j_2, ..., j_c) can be either secret or known publicly. These selected
coding blocks are changed in 305 into w"_{j_i} according to another
(possibly secret) key S2, i.e., w'_{j_i} is changed into
w"_{j_i} = S2(w'_{j_i}, W, D, C). We denote by W' the ECC transform of W,
and by W" the word obtained from W' by replacing each coding block
w'_{j_i} by w"_{j_i}.
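
The steps of this first embodiment can be sketched as follows. The
sketch is illustrative only and rests on several assumptions not taken
from the description: a keyed digest (HMAC-SHA256) stands in for the
encryption unit 302, the primary code is a five-fold repetition code,
addresses are counted from 0, and the keyed change at 305 is a single
bit flip whose position depends on S2. All names and key values are
hypothetical.

```python
import hashlib
import hmac


def coded_message(w_bytes, d_bytes, s1, c):
    """Stand-in for unit 302: a c-bit keyed digest of (W, D), with c <= 256."""
    digest = hmac.new(s1, w_bytes + d_bytes, hashlib.sha256).digest()
    bits = [(byte >> k) & 1 for byte in digest for k in range(8)]
    return bits[:c]


def addresses(m, c):
    """Algorithm A at 304: c addresses spread evenly over m coding blocks."""
    return [(i * m) // c for i in range(c)]


def mark(codewords, coded, s2):
    """Block 305: where s_i is 1, flip one S2-dependent bit of block j_i."""
    marked = [list(cw) for cw in codewords]
    for i, (j, s) in enumerate(zip(addresses(len(marked), len(coded)), coded)):
        if s:
            pos = hmac.new(s2, bytes([i]), hashlib.sha256).digest()[0] % len(marked[j])
            marked[j][pos] ^= 1
    return marked


W = [1, 0, 1, 1, 0, 0, 1, 0] * 5
codewords = [[b] * 5 for b in W]                 # W' under the toy code
C = coded_message(bytes(W), b"serial-0001", b"S1-secret", 8)
marked = mark(codewords, C, b"S2-secret")        # W"
decoded = [int(sum(cw) > 2) for cw in marked]    # what an ordinary reader sees
```

An ordinary decoder still returns W from the marked word, since each
selected block differs from its original by at most one bit.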

The word W" is what gets recorded at function block 306. When
read with an ordinary reader 310, W" goes through an ECC decoder 307
to yield back W if there have not been too many errors.

To check that the recording is original, one needs a special
reader at 309 which accesses W" and delivers it without passing
through the ECC decoder 307. The mechanism for reading W" is part of
commercially available audio-CD and CD-ROM players, and will be
understood by those skilled in the art. A special reader can be
constructed by intercepting the signal before it reaches the error
correcting circuitry.

One can then verify that C is as it should be given the significant
content W and other data D. More precisely, because of errors, the C
one reads with the special reader 309 may be slightly different from
the original C. To verify authenticity of the encoding, one verifies
that the rate of errors in the coding blocks is of the same order as
in the rest of the recording.
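
The comparison of error rates can be sketched as follows, as an
illustration only. The tolerance parameters factor and floor are
assumptions introduced here to give "of the same order" a concrete
meaning; the description does not specify them, and the function names
and sample blocks are hypothetical.

```python
def error_rate(read_blocks, expected_blocks):
    """Fraction of bit positions where the read blocks differ from the expected ones."""
    total = sum(len(block) for block in expected_blocks)
    wrong = sum(
        r != e
        for read, expected in zip(read_blocks, expected_blocks)
        for r, e in zip(read, expected)
    )
    return wrong / total


def same_order(rate_coding, rate_rest, factor=10.0, floor=1e-3):
    """Accept when the coding-block rate is within `factor` of the background rate."""
    return rate_coding <= factor * max(rate_rest, floor)


expected = [[0, 1, 1, 1, 1], [1, 0, 0, 0, 0]]  # coding blocks as they should read
authentic = [[0, 1, 1, 1, 1], [1, 0, 0, 0, 0]]  # read from a genuine recording
forged = [[1, 1, 1, 1, 1], [0, 0, 0, 0, 0]]     # unmarked codewords on a copy
rate_rest = 0.01                                # assumed rate in the rest of the recording

ok_authentic = same_order(error_rate(authentic, expected), rate_rest)
ok_forged = same_order(error_rate(forged, expected), rate_rest)
```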

Because of the errors which may occur in the coding blocks,
public key encryption cannot be readily adapted to the embodiment
represented in Figure 3. Two embodiments will therefore be described
which allow the use of public key encryption, as this is often the most
convenient method to ensure secure and easy verifiability: several
agents can then verify authentication codes, but far fewer can
generate them.

A second embodiment will next be described with reference to
Figure 4. The significant content W = w_1 w_2 ... w_N at input block 400 goes
through a secure, publicly known, hash function at function block 440
to yield a much shorter word Q at output 445. The word Q is then
possibly concatenated with supplementary data D = x_1 x_2 ... x_d at input
block 401 (such as time, data and/or serial numbers attached
physically to the recording) to form a word Z = u_1 u_2 ... u_p at output 450
(if there is no D, Z is just Q). The word Z is encrypted by a secret key
S1 in the encryption unit 402, the secret key being now chosen as the
secret part of a secret key/public key (SK/PK) pair, to generate a
coded message C of length c at output 403. The coded message C can be
represented as some sequence s_1 s_2 ... s_c of 0s and 1s.
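
The generation of C from W and D can be sketched as follows. This is
a toy illustration under loudly stated assumptions: SHA-256 stands in
for the publicly known hash function, and a textbook RSA key pair built
from two small Mersenne primes stands in for the SK/PK pair. Such a key
is far too small to be secure, and the function names and sample inputs
are hypothetical.

```python
import hashlib

P1, P2 = 2**31 - 1, 2**61 - 1          # two known Mersenne primes (toy key size)
N = P1 * P2                            # public modulus
PK = 65537                             # public part of the pair
S1 = pow(PK, -1, (P1 - 1) * (P2 - 1))  # secret part (modular inverse, Python 3.8+)


def make_z(w_bytes, d_bytes):
    """Blocks 440 to 450: hash W into a short word Q, then append D."""
    q = hashlib.sha256(w_bytes).digest()
    return q + d_bytes


def to_int(z):
    """Reduce Z to an integer below N so that the toy key can process it."""
    return int.from_bytes(hashlib.sha256(z).digest(), "big") % N


def make_c(w_bytes, d_bytes):
    """Encryption unit 402: apply the secret key S1 to Z."""
    return pow(to_int(make_z(w_bytes, d_bytes)), S1, N)


def check_c(c, w_bytes, d_bytes):
    """Verification that needs only the public part of the pair."""
    return pow(c, PK, N) == to_int(make_z(w_bytes, d_bytes))


C = make_c(b"significant content", b"serial-0001")
genuine = check_c(C, b"significant content", b"serial-0001")
wrong_serial = check_c(C, b"significant content", b"serial-0002")
```

Anyone holding PK can run check_c, while producing a valid C for a new
pair (W, D) requires S1.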

At the ECC encoder 420, the word W is transformed to the word
W' = w'_1 w'_2 ... w'_M of length M at output 421 (in general, M is greater
than N). Next, a second ECC encoder at 404 converts C into ECC code
words C' = t_1 t_2 ... t_{c'} of length c' > c at output 405. Note that the second
error correcting code used at ECC encoder 404 does not have to be the
same as the one used at ECC encoder 420. To distinguish this ECC
encoder/decoder pair from the first ECC encoder/decoder pair, we will
call this the secondary ECC encoder/decoder.



A defined algorithm A at function block 406 associates to M a
collection j_1 < j_2 < ... < j_{c'} < M of addresses of coding blocks
w'_{j_1}, w'_{j_2}, ..., w'_{j_{c'}}. For instance, one can take the w'_{j_i}'s, with i in
{1, 2, ..., c'}, as evenly distributed along W'. The choice of
(j_1, j_2, ..., j_{c'}) should be known publicly, or at least by whomever one
wants to be able to check the authenticity of the recording. These
selected coding blocks are changed in function block 407 into
w"_{j_i} = f(t_i, w'_{j_i}) in such a way that w"_{j_i} can be interpreted as a 0 or a 1
according to a publicly known rule. Also, the function f is such that
there is a map g satisfying t_i = g(w"_{j_i}). We denote by W' the ECC
transform of W, and by W" the word obtained from W' by replacing each
coding block w'_{j_i} by w"_{j_i}. The word W" is such that any reading of the
word w"_{j_1} w"_{j_2} ... w"_{j_{c'}} which is not too much spoiled by errors is
interpreted as C by running the secondary ECC decoder on the word

g(w"_{j_1}) g(w"_{j_2}) ... g(w"_{j_{c'}}).
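
One possible pair of maps f and g can be sketched as follows, for
illustration only. The sketch assumes a five-fold repetition code as
the primary ECC, block parity as the publicly known rule g, a
three-fold repetition code as the secondary ECC, and the use of every
coding block as an address. None of these choices is prescribed by the
description, and all names are hypothetical.

```python
R = 5   # primary code: five-fold repetition (assumed), corrects 2 errors per block
R2 = 3  # secondary code: each bit of C repeated three times (assumed)


def g(block):
    """Publicly known rule: read a coding block as 0 or 1 through its parity."""
    return sum(block) % 2


def f(t, block):
    """Change the block as little as possible so that g(f(t, block)) == t."""
    out = list(block)
    if g(out) != t:
        out[0] ^= 1
    return out


def primary_decode(block):
    return int(sum(block) > R // 2)


def secondary_encode(coded):
    return [s for s in coded for _ in range(R2)]


def secondary_decode(bits):
    return [int(sum(bits[i:i + R2]) > R2 // 2) for i in range(0, len(bits), R2)]


W = [1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0]
C = [1, 0, 1, 1]
W1 = [[b] * R for b in W]        # W'
C1 = secondary_encode(C)         # C' = t_1 ... t_c'
addrs = list(range(len(C1)))     # assumed choice: every block carries one t_i
W2 = [list(b) for b in W1]
for t, j in zip(C1, addrs):
    W2[j] = f(t, W1[j])          # W"

W2[4][2] ^= 1                    # one read error inside a coding block
read_W = [primary_decode(b) for b in W2]
read_C = secondary_decode([g(W2[j]) for j in addrs])
```

The read error changes the parity of one block, so one t_i is read
wrongly; the secondary decoder absorbs it and C is still recovered.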






The word W" is what gets recorded at 408. When read with an
ordinary reader 410, W" goes through an ECC decoder at 409 to yield
back W at output 430 if there have not been too many errors.

To check that the recording is original, one needs a special
reader (as described previously) at 460 which accesses W" and delivers
it without passing through the ECC decoder.

One can then verify that C is as it should be given W and D, 
using the public part of the SK/PK pair. This check of authenticity 
may be performed by a specialized reader which also outputs the 
significant content W, so that authentication can be performed while 
inspecting W. In case this invention is used to protect retail items, 
the manufacturer may require the retailers selling its brand to use 
only such an authenticating reader when customers want to inspect W.

A third embodiment will now be described with reference to
Figure 5. The significant content W = w_1 w_2 ... w_N at input block 500 goes
through a secure, publicly known, hash function at function block 540
to yield a much shorter word Q at 545. The word Q is then
concatenated with an authentication message A = u_1 u_2 ... u_a at input block
501 to form a word Z = t_1 t_2 ... t_p at output 550. The word Z is encrypted
by the secret part S1 of a SK/PK pair in the encryption unit 502 to
generate a coded message D at output 503.

At the ECC encoder 520, the word W is transformed to the word
W' = w'_1 w'_2 ... w'_M of length M at 521 (in general, M is greater than N).
A defined algorithm A at function block 506 associates to M a
collection j_1 < j_2 < ... < j_{c'} < M of addresses of coding blocks
w'_{j_1}, w'_{j_2}, ..., w'_{j_{c'}}. For instance, one can take the w'_{j_i}'s, with i in
{1, 2, ..., c'}, as evenly distributed along W'. The choice of
(j_1, j_2, ..., j_{c'}) should be known publicly, or at least by whomever one
wants to be able to check the authenticity of the recording. These
selected coding blocks are changed in function block 507 into
w"_{j_i} = f(u_i, w'_{j_i}) in such a way that w"_{j_i} can be interpreted as a 0 or a 1
according to a publicly known rule. Also, the function f is such that
there is a map g satisfying u_i = g(w"_{j_i}). Again the choice of f and g
should be known publicly, or at least by whomever one wants to be able
to check the authenticity of the recording. We denote by W' the ECC
transform of W, and by W" at 508 the word obtained from W' by
replacing each coding block w'_{j_i} by w"_{j_i}. A specialized reader can
extract the word Z from the recording.

Because of errors, what is actually read (assuming the recording
is authentic) will be an approximation Z' to Z, this approximation
being close if there are not too many errors. The word D is also
attached to the recording at 550 (for instance in the form of a bar
code on the physical support of the recording) and one checks
authenticity by verifying that Z' is close enough to the word Z
extracted from D by the public part of the SK/PK pair.
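
The closeness test between Z' and Z can be sketched as follows, as an
illustration only. The Hamming distance and the threshold max_fraction
are assumptions introduced here; the description does not fix how
"close enough" is measured, and the names and sample words are
hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two words of equal length differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def is_authentic(z_read, z_from_d, max_fraction=0.05):
    """Accept when at most `max_fraction` of the bits of Z' differ from Z."""
    return hamming_distance(z_read, z_from_d) <= max_fraction * len(z_from_d)


Z = [1, 0] * 20          # word recovered from D with the public key (40 bits)
Z_read = list(Z)
Z_read[7] ^= 1           # a single read error in Z'
Z_other = [0] * 40       # what a recording carrying no valid mark might yield

accepted = is_authentic(Z_read, Z)
rejected = not is_authentic(Z_other, Z)
```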

Thus has been described a method and apparatus which permits 
recognition whether a recording is original and/or if it has been 
performed by the legitimate originator. Furthermore, the method and 
apparatus provide a way to authenticate a digital recording where no 
significant bit of the recording can be modified for purposes of the 
authentication. Still further, the described watermarking technique 
protects the recordings from counterfeiting while avoiding the need 
for special apparatus for reading the recordings. 

While the invention has been described in terms of three
preferred embodiments, those skilled in the art will recognize that
the invention can be practiced with modification within the scope of
the appended claims.

CLAIMS

1. A method for attaching an added message to a digital message so
that the significant content of the digital message is completely
unchanged comprising the step of hiding the added message in an error
correcting code for the significant content of the digital message.

2. The method of claim 1, further comprising the step of encrypting
at least a portion of the significant content to generate the added
message.

3. The method of claim 2, wherein the step of encrypting uses a 
public key encryption method. 

4. The method of any preceding claim, wherein two or more layers of
error correction are used in the error correcting code.

5. The method of any preceding claim, wherein a second data set is 
attached to the digital message. 

6. The method of claim 5, wherein the second data set is attached
to a physical support of the digital message.

7. The method of claim 6 further comprising the step of reading
said added message to check that the physical support of the digital
message has not been counterfeited.

8. The method of claim 1, further comprising the step of reading
said added message to check that the significant content of the
digital message is authentic.

9. The method of any preceding claim wherein the digital message is 
stored in a recording. 

10. The method of claim 9 further comprising the steps of:

decoding the error correction code; and

reading the added message to check that the physical support of
the recording has not been counterfeited.

11. The method of claim 9 further comprising the steps of:

decoding the error correction code; and

reading said added message to check that the significant content
of the recording is authentic.

12. The method of claim 1 further comprising the step of
transmitting the added message and the significant content.

13. The method of claim 1 wherein the digital message is transmitted
over a transmission medium, and said added message is hidden in the
error correcting code specific to said transmission medium.

14. A method for attaching an added message to a digital recording
so that a significant content of the digital recording is completely
unchanged, comprising the steps of:

selecting an added message that is to be attached to the
significant content;

associating the added message with the significant content;

selecting an error correction code for the significant content; and

hiding the added message within the error correction code.

15. The method of claim 14 wherein the digital message is stored in 
a recording. 

16. The method of claim 14 wherein the digital message is 
transmitted over a transmission medium. 

17. Digital recording apparatus including:

means for recording a significant content onto a recording
medium;

means for associating an added message with said significant
content;

a means for selecting an error correction code for the
significant content; and

a means for hiding the added message within said error
correction code.



18. Data processing apparatus comprising:

means for generating a digital message with a significant
content;

means for associating an added message with said significant
content;

means for selecting an error correction code for the significant
content;

means for hiding the added message within said error correction
code; and

means for transmitting the added message and the significant
content.



19. A program storage device readable by a machine, tangibly
embodying a program of instructions executable by the machine to
perform method steps for attaching an added message to a digital
recording so that a significant content of the digital recording is
completely unchanged, said method step comprising hiding the added
message in an error correcting code for the significant content of the
digital recording.

20. A program storage device readable by a machine, tangibly
embodying a program of instructions executable by the machine to
perform method steps for attaching an added message to a digital
recording so that a significant content of the digital recording is
completely unchanged, said method steps comprising:

associating an added message with the significant content;

selecting an error correction code for the significant content; and

hiding the added message in an error correcting code for the
significant content of the recording.